Search | Global Index Medicus

Identification of patterns related to linkage groups or disequilibrium by factor analysis / Identificação de padrões relacionados a grupos de ligação ou de desequilíbrio por análise de fatores

Oliveira, Cristiano Ferreira de; Teixeira, Gabriely; Temoteo, Alex da Silva; Nascimento, Moysés; Cruz, Cosme Damião.

Ciênc. rural (Online) ; 51(5): e20190984, 2021. graf

Article in English | LILACS-Express | LILACS | ID: biblio-1153898

ABSTRACT

Empirical patterns of linkage disequilibrium (LD) can be used to increase the statistical power of genetic mapping. This study aimed to verify the efficacy of factor analysis (FA) applied to SNP molecular marker data sets for identifying linkage groups and haplotype blocks. The SNP data set was obtained by simulating an F2 population and contained 2000 markers genotyped in 500 individuals. The FA factor loadings were estimated in two ways: from the matrix of distances between markers (A) and from the correlation matrix (R). The number of factors (k) was chosen in two ways: from the scree plot and from the proportion of total variance explained. Matrices A and R led to similar results. Based on the scree plot, k was set to 10, and the factors were interpreted as representing linkage groups. The second criterion led to 50 factors, which were interpreted as representing haplotype blocks. This shows the potential of the technique, which yields results applicable to any type of population and can support or corroborate the interpretation of genomic studies. The study demonstrated that FA was able to identify patterns of association between markers, identifying subgroups of markers that reflect linkage groups and also linkage disequilibrium groups.
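The correlation-matrix route described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state the extraction or rotation method, so principal-component extraction followed by varimax rotation is assumed here, the toy F2-like simulation is far smaller than the 2000 markers and 500 individuals of the study, and all function names are hypothetical.

```python
import numpy as np


def simulate_markers(n_ind=500, n_groups=3, markers_per_group=10, r=0.02, seed=0):
    """Toy F2-like genotypes coded 0/1/2 (assumed design, not the study's simulator).

    Markers inside a group are linked (recombination fraction r between
    neighbours); different groups segregate independently.
    """
    rng = np.random.default_rng(seed)

    def gametes():
        blocks = []
        for _ in range(n_groups):
            start = rng.integers(0, 2, size=(n_ind, 1))
            # Each crossover between adjacent markers switches the transmitted allele.
            cross = rng.random((n_ind, markers_per_group - 1)) < r
            flips = np.cumsum(cross, axis=1) % 2
            blocks.append(np.hstack([start, (start + flips) % 2]))
        return np.hstack(blocks)

    # An F2 individual receives one gamete from each F1 parent.
    return gametes() + gametes()


def varimax(L, n_iter=200, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix (assumed rotation choice)."""
    p, k = L.shape
    T = np.eye(k)
    d_old = 0.0
    for _ in range(n_iter):
        LR = L @ T
        grad = L.T @ (LR ** 3 - LR @ np.diag((LR ** 2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(grad)
        T = U @ Vt
        d = s.sum()
        if d_old != 0.0 and d / d_old < 1.0 + tol:
            break
        d_old = d
    return L @ T


def factor_loadings(genotypes, k):
    """Loadings of the first k factors from the marker correlation matrix R."""
    R = np.corrcoef(genotypes, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = varimax(eigvecs[:, :k] * np.sqrt(eigvals[:k]))
    explained = np.cumsum(eigvals) / eigvals.sum()  # cumulative variance proportion
    return loadings, eigvals, explained


def n_factors_for_variance(explained, threshold):
    """Smallest k whose cumulative explained variance reaches the threshold."""
    return int(np.searchsorted(explained, threshold) + 1)


def assign_markers(loadings):
    """Assign each marker to the factor on which it loads most strongly."""
    return np.argmax(np.abs(loadings), axis=1)
```

Plotting `eigvals` against their rank gives the scree plot, while `n_factors_for_variance` corresponds to the second criterion. In this sketch a small k groups markers by the independent blocks that were simulated, which parallels the abstract's reading of the 10-factor solution as linkage groups.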

Otimização do mapeamento genético vegetal de populações duplo-haplóide via simulação computacional / Optimization of vegetal genetic mapping of double-haploid populations via computational simulation

Brito, Silvan Gomes de; Melo Filho, Péricles de Albuquerque; Silva, Mairykon Coelho da; Arcelino, Eliane Cristina; Coutinho, Alisson Esdras; Neder, Diogo Gonçalves.

Biosci. j. (Online) ; 30(3 Supplement): 311-317, 2014. tab

Article in Portuguese | LILACS | ID: biblio-947751

ABSTRACT

Genetic mapping is a necessary step toward understanding genome organization and the relationship between genes and phenotype. One of the main problems is determining the order and correct spacing of markers in a genetic map, as well as the number of individuals that should make up the population. The objective of this study was therefore to evaluate, by computer simulation, the level of genome saturation and the optimal size of simulated double-haploid populations for constructing more reliable linkage maps. Parental genomes and double-haploid populations were simulated using dominant molecular markers spaced equidistantly at 5, 10 and 20 cM. The simulated population sizes were 100, 200, 300, 500, 800 and 1000 individuals, each with ten linkage groups and 100 replicates per sample. All generated populations were analyzed, and the resulting estimated genome was compared with the originally simulated genome. The optimal size of double-haploid populations for genetic mapping was at least 200, 500 and 1000 individuals for saturated, moderately saturated and low-saturation genomes, respectively. Populations of the same size tend to produce more accurate maps at higher levels of genome saturation.
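A minimal sketch of this kind of simulation experiment follows. It is an illustration under stated assumptions, not the software used in the study: Haldane's mapping function (no crossover interference) is assumed for converting cM to recombination fractions, only a single linkage group is simulated, accuracy is measured as the mean absolute error of adjacent recombination fractions instead of a full map comparison, and all function names are hypothetical. Because double-haploid lines are fully homozygous, a dominant marker can be scored as a simple 0/1 value.

```python
import numpy as np


def haldane_r(d_cM):
    """Recombination fraction for a map distance in cM (Haldane's function, assumed)."""
    return 0.5 * (1.0 - np.exp(-2.0 * d_cM / 100.0))


def simulate_dh(n_ind, n_markers, spacing_cM, rng):
    """Simulate one linkage group of equidistant markers in a double-haploid population.

    Each line is a doubled gamete, so its genotype is the 0/1 parental origin
    at every marker.
    """
    r = haldane_r(spacing_cM)
    start = rng.integers(0, 2, size=(n_ind, 1))
    # Independent crossovers in adjacent intervals (no interference).
    cross = rng.random((n_ind, n_markers - 1)) < r
    return np.hstack([start, (start + np.cumsum(cross, axis=1)) % 2])


def estimate_adjacent_r(geno):
    """Fraction of lines that are recombinant between each pair of adjacent markers."""
    return (geno[:, 1:] != geno[:, :-1]).mean(axis=0)


def mean_abs_error(n_ind, spacing_cM, n_markers=11, reps=100, seed=0):
    """Average absolute error of estimated vs. simulated recombination fractions."""
    rng = np.random.default_rng(seed)
    true_r = haldane_r(spacing_cM)
    errors = []
    for _ in range(reps):
        geno = simulate_dh(n_ind, n_markers, spacing_cM, rng)
        errors.append(np.abs(estimate_adjacent_r(geno) - true_r).mean())
    return float(np.mean(errors))
```

Calling `mean_abs_error` over the population sizes (100 to 1000) and spacings (5, 10 and 20 cM) listed in the abstract gives a rough picture of how estimation error depends on population size and marker density. It does not evaluate marker ordering, which the study also considers.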

Subject(s)

Chromosome Mapping, Genome, Plant Breeding
